Published online 4 June 2012 Nucleic Acids Research, 2012, Vol. 40, Web Server issue W415-W422 

doi:10.1093/nar/gks515

Quantum.Ligand.Dock: protein-ligand docking with
quantum entanglement refinement on a GPU system

Alexander A. Kantardjiev* 

Biophysical Chemistry Group, Institute of Organic Chemistry, Bulgarian Academy of Sciences, Sofia 1113, 
Bulgaria 



Received March 12, 2012; Revised May 8, 2012; Accepted May 9, 2012



ABSTRACT 

Quantum.Ligand.Dock (protein-ligand docking with
graphic processing unit (GPU) quantum entanglement
refinement on a GPU system) is an original
modern method for in silico prediction of protein- 
ligand interactions via high-performance docking 
code. The main flavour of our approach is a combin- 
ation of fast search with a special account for 
overlooked physical interactions. On the one hand,
we account for self-consistency and the mutual effects of
the docking partners on their proton equilibria. On the
other hand, Quantum.Ligand.Dock is the only
docking server offering such a subtle supplement 
to protein docking algorithms as quantum entangle- 
ment contributions. The motivation for development 
and proposition of the method to the community 
hinges upon two arguments— the fundamental im- 
portance of quantum entanglement contribution in 
molecular interaction and the realistic possibility to 
implement it by the availability of supercomputing 
power. The implementation of sophisticated quan- 
tum methods is made possible by parallelization at 
several bottlenecks on a GPU supercomputer. The 
high-performance implementation will be of use for 
large-scale virtual screening projects, structural 
bioinformatics, systems biology and fundamental 
research in understanding protein-ligand recogni- 
tion. The design of the interface is focused on feasi- 
bility and ease of use. Protein and ligand molecule 
structures are supposed to be submitted as atomic 
coordinate files in PDB format. A customization 
section is offered for addition of user-specified 
charges, extra ionogenic groups with intrinsic pKa
values or fixed ions. Final predicted complexes are 
ranked according to obtained scores and provided 
in PDB format as well as interactive visualization in a 
molecular viewer. Quantum.Ligand.Dock server can
be accessed at http://87.116.85.141/LigandDock.html.



INTRODUCTION 

Understanding protein-ligand interactions is a major 
focus for modern molecular biophysics and structural bio- 
informatics research. On the practical side, application of 
drug design techniques requires the availability of fast and
reliable docking methods that can account for all major
aspects of molecule interaction physics. Despite the 
progress in prediction via in silico methods, intricacies in 
protein-ligand interactions are still beyond our reach 
(1-3). The introduction of Fourier correlation methods 
(4) brought reasonable speed of algorithms for rigid- 
body docking. Graphic processing unit (GPU) supercom- 
puter systems provided additional breakthrough in this 
class of molecular modeling techniques (5). Thus, the 
crucial next step is to focus on the precise description of 
the physics of protein-ligand interactions. The most 
reliable description is via ab initio quantum mechanical
methods, and the recent possibility of accessing adequate
computing power obliges the community to address the
problem in the context of practical protein-ligand
analysis tools. Another issue is the treatment of long-range
electrostatics and protonation states (6-10). Modern
docking algorithms are expected to treat the self-consistency
of long-range interactions and the mutual effect of the
protein and ligand molecules on each other's protonation
states. In this respect, we have already contributed in
the case of protein-protein docking and now apply this
concept to the protein-small molecule case, with a novel,
advanced high-performance implementation.

Prediction of protein-protein and protein-ligand inter- 
actions via docking methods is at the focus of intense 
research (11-22). An essential step of any docking 
workflow is to find a list of ranked mutual orientations
based on a scoring measure for shape complementarity 
and long-range interactions (electrostatics). The 
methods implementing rigid-body dock borrow ideas 



*To whom correspondence should be addressed. Tel: +359 2 9606 123; Fax: +359 2 986 27 95; Email: alexander.kantardjiev@gmail.com;
alexkant@orgchm.bas.bg

© The Author(s) 2012. Published by Oxford University Press. 

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0),
which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.






from protein-protein docking approaches such as 
the popular ZDOCK (11), Hex (12), PIPER (13) and 
GRAMM-X (14). The first rigid body docking program 
based on fast Fourier transformation is the pioneering 
DOT application (15). 

A subsequent step is aimed at refinement of rigid 
docking results by taking into account short-range 
interactions. A precise treatment requires account 
for backbone and side chain flexibility (16) — e.g. 
RosettaDock (17) and HadDock (18). Specific popular 
applications for protein-ligand docking that dominate 
the field are AutoDock (20) and SwissDock (21). An al- 
ternative idea for docking is the search for analogy in 
known protein-ligand interfaces reminiscent of the 
protein-protein docking as implemented in PRISM (22). 

However, none of these methods addresses two issues:
quantum effects and the self-consistency of electrostatic
interactions (including the mutual influence of docking
partners on their protonation states through interdependent
perturbation of pKa values).

Our contribution is the implementation of this essential 
but missing link in the context of protein-ligand inter- 
actions and its realization on a massively parallel GPU 
supercomputer via C/C++/OpenCL programming envir- 
onment. Thus, we have developed ultrafast docking code 
with a strong potential for large-scale systems biology 
projects. Concurrently, we have put on a sound theoretical
basis the interdependency of protein-ligand electric fields,
the mutual influence on pKa values (ionization states)
upon molecule encounter and the fundamentally important
quantum entanglement effect.

On the docking algorithmic side, we make use of the 
significant speedup of the fast Fourier transform (FFT) 
parallelized effectively under OpenCL environment. 
However, the Fourier transform is not used in the spirit 
of the traditional grid-based Katchalski-Katzir algorithm 
(4). We implement a version of 6D correlation search 
which makes use of spherical polar functions (22). It is a 
gridless method implemented via spherical polar Fourier 
representation of docking partners and several 1D FFTs.

On the electrostatics side, we apply an improvement 
of our own self-consistent and rigorous method, 
GPU.proton.DOCK/PHEPS/PHEMTO (23-25). Our 
approach to electrostatics is characterized by the implementation
of fast algorithms and methods with a sound physical
background that is supported by numerous benchmarks:
validation by comparison with experimental studies (NMR and
IR data), as shown in a number of peer-reviewed publications
over the years (26-30). The estimation of the protein
electrostatic potential distribution is based on GPU
parallelization via CUDA kernels, our previous implementation
(23), and an implementation of a hierarchical fast multipole
method (FMM) in the OpenCL environment for additional speedup.
Thus, our intrinsically fast electrostatics becomes ultrafast,
an essential breakthrough since each sampling step of the
6D translation-rotation space (5 rotational and 1 translational
degree of freedom) requires estimation of electrostatic
energies, update of pKa values and reassignment of
protonation charges.



Along with these improvements, we implement a reasonable
and practical approach for estimating a fundamental
quantum mechanism (quantum entanglement), which is
emerging as a major topic in modern molecular science but is
still ignored in current docking approaches. The quantum
entanglement contribution to the protein-ligand interaction
is estimated via calculation of entangled states of the composite
protein-ligand Hilbert space. Technically speaking,
it is the tensor product $H_{Protein} \otimes H_{Ligand}$ of the Hilbert
spaces of the protein molecule ($H_{Protein}$) and the ligand
molecule ($H_{Ligand}$). For example, by using basis vectors
$|0\rangle_{Protein}$, $|1\rangle_{Protein}$ for the Hilbert space $H_{Protein}$ and basis
vectors $|0\rangle_{Ligand}$, $|1\rangle_{Ligand}$ for the Hilbert space $H_{Ligand}$, we
can define the following entangled Bell state:

$$|\Psi\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle_{Protein} \otimes |1\rangle_{Ligand} - |1\rangle_{Protein} \otimes |0\rangle_{Ligand} \right) \qquad (1)$$

However, we are not going to theorize in this publication
(details are given at the Quantum.Ligand.Dock
Server, on the Supplement and Benchmark pages). We aim
to provide a practical tool including this effect, and this
article describes the major steps of the implementation
without delving deep into the theory. The motivation for
this development and the proposal of the method to the
community hinges upon two arguments: the importance
of the quantum entanglement contribution to the overall
protein-ligand interaction and the realistic possibility nowadays
to implement it thanks to the availability of supercomputing power.
In this way, we provide the community with a modern
docking method that has a practical interface and at the same
time transcends some limitations of other
docking tools. Note that using this measure does not interfere
or overlap with the classical continuum electrostatics
(pKa evaluations, including mutual interactions) or the steric
overlap measure.

MATERIALS AND METHODS 

Molecular recognition factors - The Art of 
Quantum Fugue 

A fascinating dimension of protein-ligand recognition
is the inclusion of the quantum entanglement contribution.
Entanglement is often referred to as a profound
and important concept in molecular science, but our
server provides a concrete implementation for practical
protein-ligand docking problems. This issue is timely,
since quantum entanglement has been shown to be ubiquitous
in molecular interactions and there is considerable
evidence of its robustness in biological systems.

In molecular physics, there is a relation linking binding
energy with an entanglement measure, and we implement this
notion in the scoring function. It has been our purpose to
provide this essential feature for the practicing structural
bioinformatician and expert computational biophysicist.
Estimation of the binding energy contribution is just one
side of quantum entanglement evaluation. Of fundamental
interest is the explanation of the correlation that is responsible
for the energy change upon protein-ligand docking.
Thus, application of quantum entanglement to the
molecular recognition problem seems compelling in itself.
These calculations have motivic kinship to other issues
such as the widely discussed and exciting quantum
non-locality in molecular systems including biological
macromolecules. For example, besides the overall
measure (witness) characterizing entanglement between
the protein and the ligand molecule, one can also report
the so-called connectivity, which informs us about the
quantum correlation range. There is still more subtlety
in this concept, but we will restrict our discussion of it to
the practical application in the docking service. We are
not going to delve into details of the implementation but only
give a feeling for the accessibility of this measure for practical
protein-ligand interactions and a notion of the methods
used to calculate it. The task is to estimate the amount of
entanglement between two subsystems, the protein
molecule and the ligand molecule. The measure used to
estimate the amount of quantum entanglement is the
so-called logarithmic negativity, a quantity derived from
the eigenvalues of the corresponding density matrices, as
well as the Schmidt rank (details are given at the
Quantum.Ligand.Dock Server, on the Supplement and
Benchmark pages).
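As an illustration of the two measures named above, the following minimal Python sketch evaluates the Schmidt rank and the logarithmic negativity for a pure state of two two-level subsystems with real amplitudes. It is a toy example and not the server's implementation: the 2x2 coefficient matrix `c`, the function names and the tolerance are assumptions introduced here. It relies on the standard result that, for a normalized pure bipartite state with Schmidt coefficients s_k, the logarithmic negativity equals 2 log2 of the sum of the s_k.

```python
import math

def schmidt_coefficients(c):
    """Singular values of a real 2x2 coefficient matrix c, where the state is
    |psi> = sum_ab c[a][b] |a>_Protein (x) |b>_Ligand (toy two-level model)."""
    # Entries of the symmetric matrix m = c^T c; its eigenvalues are the
    # squared singular values of c.
    a = c[0][0] ** 2 + c[1][0] ** 2
    d = c[0][1] ** 2 + c[1][1] ** 2
    b = c[0][0] * c[0][1] + c[1][0] * c[1][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)
    eigenvalues = [mean + radius, max(mean - radius, 0.0)]
    return [math.sqrt(x) for x in eigenvalues]

def schmidt_rank(c, tol=1e-12):
    """Number of non-vanishing Schmidt coefficients."""
    return sum(1 for s in schmidt_coefficients(c) if s > tol)

def log_negativity(c):
    """Logarithmic negativity of a normalized pure state, in bits."""
    return 2.0 * math.log2(sum(schmidt_coefficients(c)))

inv_sqrt2 = 1.0 / math.sqrt(2.0)
bell = [[0.0, inv_sqrt2], [-inv_sqrt2, 0.0]]   # (|01> - |10>)/sqrt(2)
product = [[1.0, 0.0], [0.0, 0.0]]             # |00>, no entanglement
```

For the Bell state both Schmidt coefficients are equal, so the rank is 2 and the logarithmic negativity is one bit; for the product state the rank is 1 and the negativity vanishes. Mixed states of realistic molecular subsystems require the partial transpose of the full density matrix, which this sketch does not attempt.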

Implementation of major concepts 

In devising our docking scheme, we had to
think contrapuntally but design the workflow sequentially.
Four major threads of thought emerge as essential:
rigid-body fit by shape complementarity, long-range
electrostatics treatment, the mutual impact of the docking partners'
ionization states and the quantum entanglement contribution.
We consider our method intrinsically satisfying: it reflects
a full picture of the protein-ligand interaction,
merging new tendencies with a high-performance realization
of earlier concepts and forging a unique workflow.
Although we shifted the overall weight to quantum
contributions, let us first look at the first step, the rigid-body
dock with shape complementarity based on FFT.

Spherical FFT sampling of translation-rotation space 

Whatever the level of treatment, a reasonable first step is to
search for shape complementarity between the protein and 
the ligand molecule. It is a common theme of modern 
docking algorithms to implement Fourier transform- 
based search for rigid-body docking. Briefly, the molecules 
are mapped on grids and then a correlation of the maps 
is calculated via the FFT algorithm. The theoretical 
arguments lie in the convolution theorem. The method 
turned out to be a breakthrough but still poses several 
inconveniences. For example, each sampling step in
rotation space requires pre-calculation of grids.
Recently, we have implemented grid-free algorithms 
based on the spherical harmonic functions in C/CUDA 
(23). Gridless (grid-free) representation of the protein 
molecule and the ligand is based on 3D polynomial expan- 
sion of spherical polar basis functions (spherical harmonic 
functions) (14). Then, sampling docking correlations is 
reduced to estimation of coefficient vectors of the 
docking partners. 



The major result, i.e. complementarity, is calculated
conveniently via a series of 1D FFTs, which are efficiently
handled on GPU systems:

$$FT_R(\gamma_l) = \sum_{z} e^{-iz\gamma_l} \sum_{xy} R^{*}_{xyz}(T_R, \alpha_r, \beta_r) \times L_{xyz}(\alpha_l, \beta_l) \qquad (2)$$

where the vector of expansion coefficients for the 'receptor' is
$R^{*}_{xyz}(T_R, \alpha_r, \beta_r)$ and for the 'ligand' molecule is $L_{xyz}(\alpha_l, \beta_l)$.
Rotation is via matrix elements of the real Wigner
rotation matrices. Translation is performed in
Gauss-Laguerre basis functions (31).
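To make the role of the 1D FFT concrete, the sketch below assumes that the score for one twist angle is a sum over an order index z of a phase factor times an inner product of receptor and ligand coefficients, as in Equation (2). Under that assumption a single 1D FFT over z yields the score for every sampled angle at once. The coefficient values, the zero-based index range and the function names are hypothetical; the recursive radix-2 transform merely stands in for a GPU FFT library.

```python
import cmath

def fft(values):
    """Recursive radix-2 FFT: X[k] = sum_z values[z] * exp(-2*pi*i*k*z/n)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def scores_direct(receptor, ligand):
    """Score for every sampled twist angle gamma_k = 2*pi*k/n by brute force."""
    n = len(receptor)
    scores = []
    for k in range(n):
        gamma = 2.0 * cmath.pi * k / n
        total = 0j
        for z in range(n):
            # inner sum over the remaining indices (x, y) for fixed order z
            inner = sum(r.conjugate() * l for r, l in zip(receptor[z], ligand[z]))
            total += cmath.exp(-1j * z * gamma) * inner
        scores.append(total)
    return scores

def scores_fft(receptor, ligand):
    """Same scores, but one 1D FFT replaces the loop over angles."""
    inner = [sum(r.conjugate() * l for r, l in zip(rz, lz))
             for rz, lz in zip(receptor, ligand)]
    return fft(inner)

# Hypothetical coefficient vectors: 4 orders z, 2 (x, y) terms per order.
receptor = [[1 + 1j, 0.5], [0.2j, 1.0], [0.3, -0.4j], [1.0, 1.0]]
ligand = [[0.5, 1j], [1.0, 0.1], [-0.2, 0.7], [0.3j, 0.3]]
direct = scores_direct(receptor, ligand)
fast = scores_fft(receptor, ligand)
```

Both routines return the same four scores; the brute-force loop costs n squared phase evaluations, whereas the transform needs on the order of n log n operations. The number of orders must be a power of two for this simple recursive variant.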

As a reminder, the FFT algorithm reduces the algorithmic
complexity to N log N. More details on this
issue are given in our previous publication (23), which describes
this procedure in the context of protein-protein
docking, and in its supplement section, including benchmark
results. In fact, any interaction potential
describing the physics of molecular recognition can be represented
via spherical polar functions, and in the next
section, we describe how to cope with the case of
long-range electrostatics.

Although a rigid docking algorithm, Quantum.Ligand.Dock
gives some flexibility by inclusion of a
softer scoring function. Hence, some structures seem to
penetrate each other in visualization mode.

In summary, a combination of modern approaches
solves the problem of the computational complexity in
sampling the protein-ligand search space. Thus, after a
careful implementation of the above algorithms, we
have to focus on the accuracy of the interaction treatment
itself.

Long-range electrostatics 

Adequate treatment of electrostatic interactions is the
central issue in molecular simulations. This is due to
their long-range and pairwise nature (quadratic computational
complexity). An additional problem to solve
together with electrostatic interactions is the self-consistent
treatment of the ionization states of the ligand
and the protein and the interdependency of the pKa values
evaluation (see next section). We have long-term experience
with protein electrostatics and its algorithmic implementation,
so we avidly look for new ways to improve
both accuracy and computational efficiency. In this
work, we offer several improvements based on the fast
multipole formalism and its efficient parallelization
within the C++/OpenCL environment. A natural extension
is to follow the Fourier representation of the previous
section, i.e. utilization of a polynomial expansion to
encode the electrostatic potential field and charge distribution
of the protein macromolecule and the ligand small
molecule. Note that this case requires a pre-computed electrostatic
field and charge distribution (which is still a good
approximation relevant to the standard formal treatment of
electrostatics). Then, the pH-dependent electrostatic
energy of a protein complex can be expressed as a
multiple integral of the converged electrostatic potential distribution
of the protein molecule and the charge distribution
of the ligand molecule. The electrostatic potential
computation is performed via multipole expansion (N log
N computational complexity).



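$$\Phi(r, \theta, \varphi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{M_n^m}{r^{n+1}} Y_n^m(\theta, \varphi) \qquad (3)$$

where $(r, \theta, \varphi)$ define the point at which the electrostatic
potential is calculated, $M_n^m$ are the moments of the expansion and $Y_n^m(\theta, \varphi)$ is
the spherical harmonic of degree n and order m.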
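A small numerical sketch may help to see why a truncated multipole expansion is useful far from the charges. The code below compares direct Coulomb summation with the two lowest-degree terms of the expansion (monopole and dipole) written in Cartesian form rather than with spherical harmonics, which is a deliberate simplification. The charge values, positions and evaluation point are hypothetical, and units are chosen so that the Coulomb prefactor equals one.

```python
import math

def potential_direct(charges, point):
    """Coulomb potential (units with 1/(4*pi*eps0) = 1) by direct summation."""
    return sum(q / math.dist(pos, point) for q, pos in charges)

def potential_multipole(charges, point):
    """Monopole + dipole approximation about the origin (Cartesian form)."""
    total_charge = sum(q for q, _ in charges)
    dipole = [sum(q * pos[i] for q, pos in charges) for i in range(3)]
    r = math.sqrt(sum(x * x for x in point))
    p_dot_r = sum(d * x for d, x in zip(dipole, point))
    return total_charge / r + p_dot_r / r ** 3

# Hypothetical cluster of three charges close to the origin.
charges = [(1.0, (0.5, 0.0, 0.0)), (-0.4, (0.0, 0.3, 0.0)), (0.7, (0.0, 0.0, -0.6))]
far_point = (40.0, 10.0, -5.0)
exact = potential_direct(charges, far_point)
approx = potential_multipole(charges, far_point)
```

At this distance the neglected quadrupole and higher terms fall off at least as the inverse cube of the distance, so the two values agree closely while the multipole form needs only the precomputed moments instead of a loop over all charges.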

To apply grid-free correlation, the electrostatic potential
is represented as an expansion in spherical polar
basis functions. Again, the orthogonality
property gives the overlap of spherical polar functions as
a scalar product of the expansion coefficients. This convenient
formalism gives us the tool to express the electrostatic
energy as a scalar product of the transformed expansion coefficients
$R_{xyz}$ of the protein electrostatic potential distribution,
obtained after a converged self-consistent procedure, and
the expansion coefficients $L_{xyz}$ of the charge distribution of the ligand molecule:

$$E(pH) = \mathrm{Re}\left[ \sum_{xyz} R^{*}_{xyz}(pH)\, L_{xyz}(pH) \right] \qquad (4)$$



However, if we want to go beyond pre-computed electrostatics,
we have to correlate protein electrostatic fields
after a self-consistent iterative procedure, which can be
applied at every sampling step. Due to the availability of
modern GPU supercomputing resources, this branch of
the docking workflow can be performed in real time. In
this case, we implemented the FMM, which accelerates the
multipole method via clever techniques to shift multipole
expansions and obtain local representations. The improvements
lead to linear O(N) computational complexity.
Our implementation is in C++/OpenCL, which is a
novel feature that we would like to provide for practically
inclined bioinformaticians who need real-time results. So,
we have reached the point where exposition of the next theme is
naturally required.

Interdependency of electrostatic fields and pKa estimation
for docking partners

The interdependency of protonation equilibria should be 
held in perfect balance as is the case for mutually inter- 
locking parenthetical structure. A major point is the 
mutual influence of the docking partners. Such a calcula- 
tion requires a separate self-consistent electrostatics run 
which includes the mutual effect of the docking partners on
each other's ionization sites and hence proton equilibria.
In this case, we implement an additional kernel to 
achieve performance adequate for real-time simulation. 

The model accepts experimentally measured pKa values of
model compounds (e.g. N-acetyl amides of each ith
ionogenic amino acid) ($pK_{mod,i}$) and evaluates the Born
term, a linear response approximation. Partial charges
assume values from molecular mechanics parameterization
sets: AMBER (CHARMM is supported too).

The pairwise interaction between any ith and jth ionic
groups can be simulated by an empirical three-term
function: $W_{ij}(r, a_k) = \sum_{k} a_k / r_{ij}^{k}$, $k = 3$. The $a_k$ values
are estimated by a non-linear procedure for best fit to
experimental data reflecting electrostatic interactions in
proteins.

At a stage before accounting for ionization, the procedure
calculates intrinsic constants:
$pK_{int,i} = pK_{mod,i} + \Delta pK_{Born,i} + \Delta pK_{par,i}$,
where $pK_{mod,i}$ is the pKa of the ith
site according to model compounds, $\Delta pK_{Born,i}$ is the Born
self-energy of the ith site and $\Delta pK_{par,i}$ is the contribution of
the ith site interacting with the set of partial (permanent,
fixed) atomic charges. For each protonation group and at
each step of the iterative self-consistent method, we
estimate the pKa shift of the ith site caused by interactions
with all other proton-binding groups. Here, the focus is on
the interpretation of the Tanford-Roxby pKa value as an
average measure to describe the energy required to
protonate an individual site at a given pH:

$$\langle pK_i \rangle = \sum_{\mu} pK_{\mu,i}\, \Omega_{\mu} \qquad (5)$$

where $\Omega(p)$ is the distribution function of the protonation
states, with $\Omega_{\mu}$ its value for microscopic state $\mu$; the form of
$\Omega(p)$ that minimizes G is the
equilibrium distribution function of the system, and $pK_{\mu,i}$
is the pK value of group i in microscopic state $\mu$.
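The self-consistent idea behind a Tanford-Roxby style treatment can be sketched in a few lines. The code below is a mean-field toy model under stated assumptions, not the server's algorithm: each site sees only the average charge of the other sites, the interaction matrix `w` is given directly in pK units, and all numerical values are hypothetical. Because binding a proton always adds one positive unit charge, a positively charged neighbourhood lowers the effective pK and a negatively charged one raises it.

```python
def tanford_roxby(pk_int, is_acid, w, ph, tol=1e-10, max_iter=1000):
    """Mean-field iteration for effective pK values at a given pH.

    pk_int  : intrinsic pK of each site
    is_acid : True if the deprotonated form carries charge -1,
              False if the protonated form carries charge +1
    w       : symmetric matrix of site-site interactions in pK units
              (positive for two like unit charges); hypothetical inputs
    """
    n = len(pk_int)
    pk = list(pk_int)
    for _ in range(max_iter):
        # average protonation and average charge of every site
        theta = [1.0 / (1.0 + 10.0 ** (ph - pk[i])) for i in range(n)]
        charge = [theta[i] - 1.0 if is_acid[i] else theta[i] for i in range(n)]
        # a net positive neighbourhood disfavours binding a proton (lower pK)
        new_pk = [pk_int[i] - sum(w[i][j] * charge[j] for j in range(n) if j != i)
                  for i in range(n)]
        if max(abs(a - b) for a, b in zip(pk, new_pk)) < tol:
            return new_pk, theta
        pk = new_pk
    raise RuntimeError("self-consistent iteration did not converge")

# Two identical acidic sites coupled by 1 pK unit, titrated at pH 4.
pk, theta = tanford_roxby([4.0, 4.0], [True, True], [[0.0, 1.0], [1.0, 0.0]], ph=4.0)
```

In this example each acid is partly deprotonated at pH 4, so its negative average charge stabilizes the proton on the neighbouring site and both effective pK values rise above the intrinsic value of 4. Strongly coupled systems may need damping of the update, which is omitted here.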

This Tanford-Roxby style procedure is a well-controlled
approximation of the strict statistical mechanics treatment.
We would like to write down the exact expression
(the derivation can be found in the Supplementary section of
the Quantum.Ligand.Dock Server):

$$\langle pK_i \rangle = \ldots \sum^{2^{M}} \ldots \qquad (6)$$



Here, p is the protonation vector, G is the free energy of 
the corresponding ionization state, M is the number of 
proton-binding groups and E is the site-site electrostatic 
interaction energy. This relation can be derived in reverse 
order starting from the canonical Tanford-Roxby 
equation by trivial substitutions. 

When the self-consistent iterative procedure meets con- 
vergence criteria, the new charge distribution is applied for 
calculation of the electrostatic potential grid. It is at this 
point that we have accelerated the code by applying C++/ 
OpenCL implementation of the FMM. A multilevel sum- 
mation technique was also tested but fast multipole 
algorithms achieved higher performance. A brief expos- 
ition of fast multipole application can be found in the 
Implementation section. 

The Ways of Quantum.Ligand.Dock 

The Quantum.Ligand.Dock server workflow allows access to
several approaches of increasing detail and sophistication
in exploring the protein-ligand docking mechanism, in
analogy to our protein-protein workflow (23). All of
them take into account, at different levels, subtle issues in
accounting for ionization states: appropriate treatment
of pH dependence and self-consistency of protonation states.
When it comes to evaluating the electrostatic interactions
of the charge system and the contribution of
protonation-dependent electrostatics to the correlation functions,
the Quantum.Ligand.Dock server provides three alternatives
to cope with the diverse needs and specific
requirements for electrostatic docking calculation by the
protein scientist:

(1) A standard, straightforward method that relies on
simple Coulomb electrostatics and immutable fields.
This is the fastest approach. Each sampling step uses
a pre-computed electrostatic field.

(2) A step towards improvement — still immutable field 
at each step but a preliminary computation is per- 
formed via self-consistent iterative electrostatics. 
Thus, we have a converged protonation charge dis- 
tribution after the iterative procedure for a given pH 
value but no update at each sampling step. 

(3) Mutual electrostatic influence of the docking 
partners. We consider this step an essential and 
crucial contribution to the docking algorithms 
field — both for the protein-protein docking (23) 
and the current application to the protein-ligand 
case. Each sampling step in the 6D docking space 
requires reevaluation of electrostatic potential and 
reassignment of protonation charges. 

Whatever mode of calculation is chosen, the user can
define a range of pH values to 'titrate' docking results. The
user is provided with an interactive Jmol Java applet to view
docked structures. The results are also available as PDB-formatted
complexes listed according to the docking
score. The user can download all predictions in
NMR/MODEL PDB format as well as archives of differently
numbered sets of single PDB files. Such
output can be readily used for visualization with convenient
molecular modelling software for rendering protein
3D structure: Chimera (32), VMD (33), etc. The final
pages of the Quantum.Ligand.Dock workflows provide
interactive visualization for each of the predicted
complexes.
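For readers who want to post-process a downloaded multi-model file, the following minimal sketch splits text in NMR/MODEL PDB format into its individual models using the standard MODEL and ENDMDL records. The example coordinates are made up, and the assumption that the model order reflects the ranking by docking score is ours rather than a documented property of the server output.

```python
def split_models(pdb_text):
    """Return a list of (model_number, lines) from MODEL/ENDMDL-delimited text."""
    models = []
    current = None
    number = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            number = int(line[6:].split()[0])
            current = []
        elif record == "ENDMDL":
            if current is not None:
                models.append((number, current))
            current = None
        elif current is not None:
            # keep every record (ATOM, HETATM, TER, ...) inside the model
            current.append(line)
    return models

# Hypothetical two-model file with a single atom per model.
example = """MODEL        1
ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N
ENDMDL
MODEL        2
ATOM      1  N   ALA A   1      12.560   5.210  -4.100  1.00  0.00           N
ENDMDL
END
"""
complexes = split_models(example)
```

Each entry of `complexes` can then be written to its own file and opened in a viewer such as Chimera or VMD.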



IMPLEMENTATION 

The first note related to implementation is our wish to
emphasize the features that are novel relative to our
previous protein-protein docking realization. These are
the efficient FMM for estimation of electrostatics
(OpenCL), a more stringent summation algorithm for ionization
states (OpenCL) and the quantum entanglement
contribution (OpenCL). In general, the algorithms implementing
docking methods (FFT correlation), electrostatics
modeling, quantum effects estimation and protein
structure handling are written in C/C++/CUDA (some
improvements in the parallel code are realized in
OpenCL), Perl and Haskell by the author. The C/C++/CUDA/OpenCL
environment is used to code computationally
demanding algorithms, which are the bottleneck
in computing time. The heart of the acceleration is
composed of GPU kernels. GPU supercomputers are
based on massively parallel and multithreaded hardware
architecture and thus achieve their limit with fine-grained
parallel decompositions. As mentioned but still worth
noting, our application of GPU parallelization is at the
stages of long-range pairwise electrostatic calculation, the
evaluation of the complementarity correlations by Fourier
transforms (FFT algorithm) and the quantum entanglement
contribution. The direct approach for electrostatics
grid estimation is of quadratic time complexity O(mn) for
n charge sites and m grid points. Our GPU kernel gave
a speedup of several tens of times over a single-core central
processing unit (CPU). Kernel development for the electrostatic
potential distribution via direct summation is
straightforward: the outer loop of
the serial implementation is parallelized. It is worth noting the significant
improvements based on the fast multipole formalism and
its efficient parallelization within the C++/OpenCL environment.
FMMs are amenable to efficient parallel implementation
and their computational complexity is linear,
O(N). Nowadays, they have proved to be the most efficient
methods in the class of hierarchical N-body approaches.
The FMM idea works as follows: a region of the system
transmits its far-field expansion to other regions. There are
several steps. First, a particle-to-multipole (P2M) expansion
is performed. Then follow the multipole-to-multipole
(M2M), multipole-to-local (M2L),
local-to-local (L2L) and local-to-particle
(L2P) expansions and the particle-to-particle (P2P) evaluation. Technical
details of the implementation and the benchmark of the
performance can be found in the Benchmark and
Supplement sections of the Quantum.Ligand.Dock server.
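The first two FMM steps listed above can be illustrated with a deliberately low-order sketch. It keeps only the monopole and dipole moments in Cartesian form, whereas a production FMM carries many more terms and also needs the M2L, L2L, L2P and P2P stages, none of which are shown. The cell contents, cell centres and function names are hypothetical.

```python
def p2m(charges, center):
    """Particle-to-multipole: monopole and dipole moments about `center`."""
    monopole = sum(q for q, _ in charges)
    dipole = tuple(sum(q * (pos[i] - center[i]) for q, pos in charges)
                   for i in range(3))
    return monopole, dipole

def m2m(moments, old_center, new_center):
    """Multipole-to-multipole: re-express the moments about a new center."""
    monopole, dipole = moments
    shift = tuple(old_center[i] - new_center[i] for i in range(3))
    shifted = tuple(dipole[i] + monopole * shift[i] for i in range(3))
    return monopole, shifted

def merge(a, b):
    """Add two expansions that share the same center."""
    return a[0] + b[0], tuple(x + y for x, y in zip(a[1], b[1]))

# Two hypothetical child cells and their parent cell.
cell_a = [(1.0, (0.2, 0.1, 0.0)), (-0.5, (0.4, 0.3, 0.1))]
cell_b = [(0.8, (1.6, 0.2, 0.3)), (0.3, (1.9, 0.4, 0.2))]
center_a, center_b, parent = (0.25, 0.25, 0.25), (1.75, 0.25, 0.25), (1.0, 0.25, 0.25)

upward = merge(m2m(p2m(cell_a, center_a), center_a, parent),
               m2m(p2m(cell_b, center_b), center_b, parent))
reference = p2m(cell_a + cell_b, parent)
```

Shifting and merging the child expansions gives the same parent moments as computing them from all particles directly. This is what lets the upward pass of the method build coarse-level expansions without revisiting individual charges.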

For the bottleneck of the docking run, the Fourier
transform, we make use of the FFT algorithm
provided by the CUFFT library (a CUDA implementation).
Our method relies on multiple 1D FFTs instead of a 3D
FFT.

'Perl' excels at efficient and elegant protein structure
parsing, parsing of parametrization sets and convenient
data structure manipulation. The web implementation
itself is driven by 'CGI/PERL' routines, with 'Java'
employed to run a molecular viewer for interactive visualization
of dipole/electric moments relative to the 3D protein
structure. The Java applet is part of the Jmol applet molecular
viewer distribution (http://jmol.sourceforge.net).
The Quantum.Ligand.Dock server expects as input two coordinate
files in PDB format: both the protein structure and
the ligand are supposed to be PDB formatted. Protein structure
files containing HETATM records are given special
attention: an option is present to account for additional
user-defined parametrization of charge properties explicitly
in the electrostatic interaction calculation. As an additional
asset, the user is given relevant information about
the protein molecule and warned about certain
inconsistencies in the protein structure that might adversely impact
the ensuing calculation, e.g. an interruption in residue
numbering, which influences electrostatics through the appearance
of terminal amino positive and carboxy negative
charge sites with intrinsic pKa values. The user is given the possibility
to edit the initial setup of ionogenic groups (paying attention
to cysteine residues in disulfide bonds and excluding covalently
modified groups). This is accomplished by
a user-friendly selection panel for the ionizable groups that are
going to be accounted for in the subsequent self-consistent
electrostatic calculation, reducing the effort needed
to customize the input protein structure. Direct editing of the PDB
file allows for a range of options aimed at the advanced
user: adding missing terminal charges, fixed (non-titratable)
integer or partial charges and titratable
groups with user-defined intrinsic pKa. We consider
such a rich electrostatic setup a significant practical boost
for our Quantum.Ligand.Dock server. Reasonably
acquainted users could address a number of subtle
issues, e.g. effects of ligands, cofactors, inhibitors and
ions. All other parameters used as input are predefined
or automatically calculated. These steps complete the
initial setup. Calculation proceeds through the aforementioned
stages: evaluation of solvent accessibilities and
the linear response Born term $\Delta pK_{Born,i}$, perturbation of
pKa by partial charges $\Delta pK_{par,i}$ and finally the iterative
procedure for self-consistent evaluation of titratable groups.
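The kind of input check mentioned above, a warning about interruptions in residue numbering, can be sketched as follows. This is a hypothetical helper and not the server's code. It relies only on the fixed-column layout of PDB ATOM records (chain identifier in column 22, residue sequence number in columns 23 to 26) and ignores insertion codes and HETATM records; the example coordinates are invented.

```python
def numbering_gaps(pdb_lines):
    """Report (chain, previous_residue, next_residue) wherever the residue
    sequence number of consecutive ATOM records jumps by more than one."""
    gaps = []
    last = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain = line[21]               # PDB column 22: chain identifier
        res_seq = int(line[22:26])     # PDB columns 23-26: residue number
        previous = last.get(chain)
        if previous is not None and res_seq - previous > 1:
            gaps.append((chain, previous, res_seq))
        last[chain] = res_seq
    return gaps

example = [
    "ATOM      1  CA  ALA A  10      11.104   6.134  -6.504  1.00  0.00           C",
    "ATOM      2  CA  GLY A  11      12.560   5.210  -4.100  1.00  0.00           C",
    "ATOM      3  CA  SER A  15      15.020   7.900  -2.300  1.00  0.00           C",
    "HETATM    4 ZN    ZN A 101      10.000  10.000  10.000  1.00  0.00          ZN",
]
gaps = numbering_gaps(example)
```

Here the jump from residue 11 to residue 15 in chain A is reported as a single gap. Such a break would introduce extra chain termini and therefore extra charged sites in the electrostatic setup.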

In accord with our previous implementation, we sample
rotation-translation space with the following default
values. First, we sample a 6D space with 1 translational and
5 rotational degrees of freedom. The traditional sampling
is also 6D but consists of 3 rotational and 3 translational
degrees of freedom. The sampling step for translation is
0.9 Å; the rotational step is 6 angular degrees. The default polynomial
expansion order is 20. The total number of mutual
orientations of docking partners in sampling is of the
order of billions ($10^9$).
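A back-of-the-envelope count shows how such defaults reach billions of orientations. The sketch assumes that each of the five angles is sampled on a uniform grid with the 6 degree step, with two azimuthal and one polar-type angle for the ligand and one of each for the receptor, and that 40 translational steps are taken. The uniform angular grid, the split of angles and the number of translational steps are illustrative assumptions, not values taken from the server.

```python
def orientation_count(angle_step_deg, n_translations):
    """Number of sampled mutual orientations on a uniform angular grid."""
    azimuthal = round(360 / angle_step_deg)   # samples for a 0-360 degree angle
    polar = round(180 / angle_step_deg)       # samples for a 0-180 degree angle
    # receptor: (alpha, beta); ligand: (alpha, beta, gamma)
    rotations = (azimuthal * polar) * (azimuthal * polar * azimuthal)
    return rotations * n_translations

total = orientation_count(angle_step_deg=6, n_translations=40)
```

With these assumptions the rotational grid alone contains about 1.9 x 10^8 points, and the 40 translational steps bring the total to roughly 7.8 x 10^9.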

As a reminder, to estimate and compare electrostatic
energies and potentials, the following energy conversion
units were used: 1 kcal = 4.186 kJ = 1.68 RT units (at
298 K) = 0.735 pKa units. The unit of $\Phi_i(pH)$, kcal/(mol·e),
is equal to 43.176 mV or 30.24 mC/m^2.
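The energy conversions into RT and pK units can be checked with a few lines of arithmetic. The sketch assumes the molar gas constant in kcal/(mol K) and uses the relation that one pK unit corresponds to RT ln 10; the variable names are ours.

```python
import math

GAS_CONSTANT = 1.987204e-3   # kcal/(mol K)
temperature = 298.0          # K

rt = GAS_CONSTANT * temperature                  # thermal energy in kcal/mol
kcal_in_rt_units = 1.0 / rt                      # 1 kcal/mol expressed in RT
kcal_in_pk_units = 1.0 / (math.log(10.0) * rt)   # 1 kcal/mol in pK units
```

The results are approximately 1.69 RT units and 0.73 pK units per kcal/mol, close to the rounded values quoted above.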

Quantum entanglement calculations are described in
detail on the Benchmark and Supplement pages of the
Quantum.Ligand.Dock Server.



BENCHMARKS AND EXTENSIVE TESTS 

In summary, computational bottlenecks appear at the FFT-based
algorithms, protein electrostatics treatment with
FMMs, proton equilibria summation algorithms and
the quantum entanglement contribution to the docking
score. However, the emergence of extremely powerful GPU
parallel architectures makes it possible to present the
service to the wide protein community, from
accomplished protein docking experts and adept structural
bioinformaticians to novice systems biology practitioners.
The approaches outlined above were applied to a
benchmark collection of protein-ligand interactions (see the
corresponding table uploaded on the Quantum.Ligand.Dock
server site Supplement page). Extensive tests for reliability
and accuracy on standard benchmarks were performed, as
well as comparative analysis in relation to other docking
algorithms. However, direct comparison with other
docking algorithms should be made with care. It is not trivial to
compare different docking methods objectively. Besides
search problems of equal complexity, the algorithms 
must be compared under the conditions of equal 



Table 1. Prediction performance for the Astex diverse set (34,36)

Method                        RMSD < 2 Å (%)    RMSD < 1 Å (%)
Quantum.Ligand.Dock           78                59
AutoDock (35)                 81.7              NA
ChemPLP^a (34,36)             81                59
GoldScore^a (34,36)           69                50
ChemScore^a (34,36)           76                48
ASP^a (36)                    72                44
LigDockCSA (35)               84.7              NA
MolGroVirtual Docker (37)     74                NA

^a http://www.ccdc.cam.ac.uk/case_studies/life_science/workcase posepred.pdf (Cambridge Crystallographic Data Centre).



running times to produce docking solutions. Thus, it is
not straightforward to draw conclusions of general
applicability. One should take into account the difference in
scoring functions, the strategy for sampling the search space,
the step parameter for the search, etc. Our approach is
comparable with Hex at the level of representation and
sampling of the search space (spherical polar Fourier
representation). The core of the acceleration is the sampling of
the mutual orientation space, and the Supplement section
contains a table of Quantum.Ligand.Dock sampling speeds
(millions of orientations per second) compared with one of
our previous GPU.proton.DOCK realizations and, at the same
time, against Hex performance for different polynomial
expansion orders. However, the inclusion of a sophisticated
treatment of electrostatics and protonation equilibria makes
direct comparisons of speed inconsistent.

On the other hand, the reliability (accuracy) of
prediction can be described in terms of the root mean square
deviation (RMSD) score. We have extensively tested the
predictive performance of our method on several
popular standard benchmark test sets. A comparison with
the predictive ability of other methods is also presented.
Here, we report (Table 1) the predictive performance of
Quantum.Ligand.Dock against the modern 'Astex diverse
set' (34), which consists of 85 protein-ligand complexes.
The predictive performance within 2 Å RMSD from the
experimentally defined structures is 78% of the test cases.
Upon dropping the quantum contribution, the predictive
performance drops to 65%. Tests with this benchmark using
AutoDock give a predictive performance of
81.7% (35).
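
The success-rate metric used in these comparisons can be sketched as follows. This is a minimal illustration rather than the server's evaluation code. It assumes that predicted and reference ligand poses share the same atom ordering and the same coordinate frame, and it ignores symmetry-equivalent atoms. The coordinates and RMSD values in the usage lines are toy data, not benchmark results.

```python
import math


def rmsd(pred, ref):
    """RMSD between two poses given as lists of (x, y, z) tuples.

    Atoms are matched by position in the list and no superposition
    is applied.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and of equal length")
    sq = sum((p - r) ** 2 for a, b in zip(pred, ref) for p, r in zip(a, b))
    return math.sqrt(sq / len(pred))


def success_rate(rmsds, cutoff=2.0):
    """Percentage of complexes whose pose lies within `cutoff` angstroms."""
    hits = sum(1 for value in rmsds if value < cutoff)
    return 100.0 * hits / len(rmsds)


# Toy data for illustration only.
ref = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pred = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
print(rmsd(pred, ref))                       # 1.0
print(success_rate([0.8, 1.9, 2.4, 3.1]))    # 50.0
```

Applying `success_rate` with cutoffs of 2.0 and 1.0 to the per-complex RMSD values of a benchmark set yields percentages of the kind reported in Table 1.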

Another popular benchmark test set is the 'Ligand
Protein DataBase' (37). Our tests showed prediction
within 2 Å RMSD for 72% (Quantum.Ligand.Dock)
and 67% (without quantum corrections). The same
benchmark was used to test SwissDock, which reached a
predictive accuracy of 70% (21).

It seems that the treatment of subtle aspects of protein-
ligand interaction physics contributes to the reliability of
docking methods. Although computationally demanding,
the method still falls in the category 'ultrafast', and our
intention is to apply it in large-scale systems biology/
structural bioinformatics projects. Relative to the
contemporary state of docking accuracy, Quantum.Ligand.Dock
is adequate and consistent.

CONCLUSION AND FUTURE DEVELOPMENT 

We are confident that the Quantum.Ligand.Dock
server will be of high interest and practical utility for a
wide range of scientists, in particular molecular biophysics
and bioinformatics experts. At the same time, it is exciting
that its unique combination of novel features may reveal as
yet uncharted possibilities for prediction, analysis and
explanation of protein-ligand interactions. Our
development effort continues towards novel functionality
and methodological improvements:

• Sophistication of quantum effects treatment (38) 

• Eliciting interplay of dipole/electric moments in 
protein-ligand recognition 

• Explicit modelling of water molecules effect on 
docking 

• Applications in virtual screening context 

• Striving for development of novel high-performance 
treatment of electrostatics 